Building, Backtesting, and Evaluating Investment Strategies

Lucas S. Macoris (FGV-EAESP)

Outline

This lecture is mainly based on the following textbooks:
1. Tidy Finance (Scheuch, Voigt, and Weiss 2023)
2. R for Data Science (Wickham, Cetinkaya-Rundel, and Grolemund 2023)

Coding Replications

For coding replications, whenever applicable, please follow this page or hover over the specific slides containing code chunks.

Ensure that you have your session properly set up according to the instructions outlined in the course webpage.
In the webpage, you can also find a detailed discussion of the examples covered in this lecture.

Disclaimer

The information presented in this lecture is for educational and informational purposes only and should not be construed as investment advice. Nothing discussed constitutes a recommendation to buy, sell, or hold any financial instrument or security. Investment decisions should be made based on individual research and consultation with a qualified financial professional. The presenter assumes no responsibility for any financial decisions made based on this content.

All code used in this lecture is publicly available and is also shared on my GitHub page. Participants are encouraged to review, modify, and use the code for their own learning and research purposes. However, no guarantees are made regarding the accuracy, completeness, or suitability of the code for any specific application.

For any questions or concerns, please feel free to reach out via email at lucas.macoris@fgv.br

Analyzing a stock’s performance

From our previous lectures, we have already been able to:
1. Employ methods for analyzing the performance of individual stocks over time
2. Calculate the historical performance of simple combinations of assets, such as equally-weighted portfolios
In reality, however, portfolio construction is far more dynamic and complex. What if we wanted to analyze the performance of a trading strategy that has a dynamic allocation rule?
In this lecture, we will be working with the PerformanceAnalytics and PortfolioAnalytics packages to:

Quantify the performance of a momentum strategy in the Brazilian financial markets
Look at ways for optimizing portfolios based on Markowitz Mean-Variance optimization and its variations

Momentum in the stock market

Do past returns explain future performance? Ideally, that shouldn’t be the case, but…

Definition

Momentum strategies capitalize on the tendency of assets that have performed well in the past to continue performing well, and vice-versa, based on the idea that market trends persist due to behavioral biases (e.g., herding, overreaction, among others).

Previously documented in academic research (Jegadeesh and Titman 1993), such rationale can be applied to stocks, bonds, commodities, currencies, etc.
1. It works across different asset classes and time periods, but it is subject to periodic crashes (e.g., 2009 momentum crash)
2. It is often used alongside factor models like the Fama-French-Carhart Four-Factor Model, which adds a momentum factor to the Fama-French three-factor model

Momentum in the stock market, continued

How can we implement a momentum strategy? In short, there are some key steps in defining the strategy parameters that need to be taken into consideration:
1. Define the lookback period (e.g., \(3\), \(6\), or \(12\) months) to measure past returns.
2. Rank assets based on past performance
3. Construct a portfolio that goes long on the top-performing assets and eventually short on the worst performers
4. Rebalance periodically (e.g., monthly or quarterly) based on a weighting criterion
5. Calculate the performance metrics
In what follows, we will be looking at a step-by-step guide for implementing such a strategy using Brazilian stocks through a hands-on exercise

Hands-On Exercise

You are a quantitative analyst at SeekingAlpha, a quantitative hedge fund specialized in automated strategies. Your goal is to quantify how a momentum-based strategy in the Brazilian stock market performed from 2018 to 2024

Instructions

We will be using data from a set of \(\small 20\) selected Brazilian stocks that have been traded over the study period
Our portfolio will be rebalanced monthly using a lookback period of roughly 3 months (about 90 calendar days, implemented in the code as 60 trading days), selecting the top \(\small5\) stocks based on the adjusted prices
The strategy will assign equal weights to all selected stocks and will consist of a long-biased strategy - i.e., you will only buy stocks

The selected stocks are: RAIZ4, ITUB3, IRBR3, BBDC4, ABEV3, YDUQ3, BBAS3, B3SA3, WEGE3, RADL3, LREN3, BRFS3, CSAN3, HAPV3, SUZB3, GRND3, MGLU3, BEEF3, EGIE3, and HYPE3

Would you recommend investing in a fund that replicates this strategy?

You can use the previous tq_get() function from the tidyquant package to collect and retrieve the data from all tickers during the study period, already in tidy format for you to manipulate:

# Load the packages used throughout this lecture
library(tidyverse)  # dplyr, tidyr, ggplot2
library(tidyquant)  # tq_get(), tq_transmute()
library(scales)     # percent labels used in the charts

# Define a list of Brazilian stocks (tickers)
br_stocks <- c("RAIZ4.SA", "ITUB3.SA", "IRBR3.SA", "BBDC4.SA", "ABEV3.SA", 
               "YDUQ3.SA", "BBAS3.SA", "B3SA3.SA", "WEGE3.SA", "RADL3.SA", 
               "LREN3.SA", "BRFS3.SA", "CSAN3.SA", "HAPV3.SA", "SUZB3.SA", 
               "GRND3.SA", "MGLU3.SA", "BEEF3.SA", "EGIE3.SA", "HYPE3.SA")

# Get stock price data
prices <- tq_get(br_stocks, from = "2018-01-01", to = "2024-01-01")

# A tibble: 28,773 × 8
   symbol   date        open  high   low close   volume adjusted
   <chr>    <date>     <dbl> <dbl> <dbl> <dbl>    <dbl>    <dbl>
 1 RAIZ4.SA 2021-08-05  7.48  7.60  7.18  7.24 98849900     6.38
 2 RAIZ4.SA 2021-08-06  7.25  7.35  7.03  7.10 28799900     6.26
 3 RAIZ4.SA 2021-08-09  7.18  7.31  7.07  7.07 14491100     6.23
 4 RAIZ4.SA 2021-08-10  7.12  7.16  7.05  7.10  9988600     6.26
 5 RAIZ4.SA 2021-08-11  7.13  7.14  6.80  6.87 31978000     6.06
 6 RAIZ4.SA 2021-08-12  6.97  6.98  6.75  6.96 17054100     6.14
 7 RAIZ4.SA 2021-08-13  6.98  7.04  6.86  7.01 21141500     6.18
 8 RAIZ4.SA 2021-08-16  6.97  7.12  6.78  7.02 13434600     6.19
 9 RAIZ4.SA 2021-08-17  6.93  6.99  6.76  6.82 17772000     6.01
10 RAIZ4.SA 2021-08-18  6.83  7     6.72  6.81  8841700     6.01
# ℹ 28,763 more rows

Step 2: Define the Trading Signals

After collecting the data, we need to make sure that our momentum strategy accurately creates the trading signals and uses them at an appropriate timestamp:
1. We use the lag() function to retrieve, for each symbol, the adjusted price from \(\small 60\) trading days earlier (roughly a 3-month lag)
2. After that, we calculate the returns over that window - in this case, we will be using log-returns
3. With the 3-month returns properly calculated, we rank stocks on each date from highest to lowest return, keeping only the five highest returns for each given date
4. Finally, we create a new date column that stores the month in which we will be evaluating the returns from investing in the top 5 stocks
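The lag-and-log-return step can be illustrated on a tiny example before applying it to the full dataset. This is only a sketch: the tickers AAA and BBB, their prices, and the 2-day lookback are made-up assumptions standing in for the real stocks and the 60-trading-day lag.

```r
library(dplyr)

# Hypothetical toy prices for two made-up tickers (not real market data)
toy_prices <- tibble(
  symbol = rep(c("AAA", "BBB"), each = 5),
  date   = rep(as.Date("2024-01-01") + 0:4, times = 2),
  price  = c(10, 11, 12, 13, 14, 20, 19, 18, 17, 16)
)

# Assumption: a 2-observation lookback stands in for the 60-trading-day lag
lookback <- 2

toy_signal <- toy_prices %>%
  group_by(symbol) %>%
  arrange(date, .by_group = TRUE) %>%
  # Price observed `lookback` rows earlier, computed within each symbol
  mutate(lagged_price    = lag(price, lookback),
         # Continuously compounded return over the lookback window
         momentum_return = log(price / lagged_price)) %>%
  ungroup()

print(toy_signal)
```

The first `lookback` rows of each symbol have no lagged price, so their momentum return is NA; that is why the lecture code calls drop_na() right after this step.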

Step 2: Define the Trading Signals


# Create the rankings
ranks <- prices%>%
  #Select relevant columns
  select(symbol, date, adjusted)%>%
  #Rename price for conciseness
  rename(price = adjusted)%>%
  #Group by symbol
  group_by(symbol)%>%
  #Reorder by symbol and date
  arrange(symbol, date)%>%
  #Get the price from the last 60th trading day (roughly a 3-month lag)
  mutate(lagged_price = lag(price, 60))%>%  
  mutate(momentum_return = log(price/lagged_price))%>% #Using continuously compounded returns
  drop_na()%>% #Remove any NAs
  group_by(year(date),month(date),symbol)%>%
  slice_head(n=1)%>%
  mutate(rank_date = floor_date(date, "month"))%>% # Set the date where we will apply the momentum strategy 
  group_by(rank_date)%>%
  mutate(rank = rank(desc(momentum_return)))%>%
  filter(rank <= 5)%>% #Get only the top 5 stocks in terms of momentum
  mutate(date = rank_date + months(1))%>% # Apply ranking next month
  ungroup()%>%
  select(rank,date,symbol)

# A tibble: 10 × 3
    rank date       symbol  
   <dbl> <date>     <chr>   
 1     1 2018-04-01 SUZB3.SA
 2     2 2018-04-01 BBAS3.SA
 3     3 2018-04-01 IRBR3.SA
 4     4 2018-04-01 MGLU3.SA
 5     5 2018-04-01 ITUB3.SA
 6     1 2018-05-01 SUZB3.SA
 7     2 2018-05-01 MGLU3.SA
 8     3 2018-05-01 IRBR3.SA
 9     4 2018-05-01 BBAS3.SA
10     5 2018-05-01 ITUB3.SA

Step 3: Compute Returns


# Compute monthly returns
returns <- prices%>%
  group_by(symbol)%>%
  tq_transmute(select = adjusted,
               mutate_fun = monthlyReturn,
               col_rename = 'monthly_return')%>%
  mutate(date = floor_date(date, "month"))%>%
  select(symbol, date, monthly_return)

# A tibble: 10 × 3
# Groups:   symbol [1]
   symbol   date       monthly_return
   <chr>    <date>              <dbl>
 1 RAIZ4.SA 2021-08-01      -0.0193  
 2 RAIZ4.SA 2021-09-01       0       
 3 RAIZ4.SA 2021-10-01      -0.0429  
 4 RAIZ4.SA 2021-11-01      -0.195   
 5 RAIZ4.SA 2021-12-01       0.181   
 6 RAIZ4.SA 2022-01-01      -0.000853
 7 RAIZ4.SA 2022-02-01      -0.111   
 8 RAIZ4.SA 2022-03-01       0.231   
 9 RAIZ4.SA 2022-04-01      -0.0146  
10 RAIZ4.SA 2022-05-01      -0.112

Step 4: Portfolio Performance

Finally, it is time to find out how your strategy performed over time
1. For that, you will match returns to ranks for each symbol and date combination
2. Because we want to make sure that we are only looking at the specific breakpoint dates for the trading strategy, we use the inner_join() function
With that, we are able to collect, for each month, the return that the strategy earned over that month!
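To see why inner_join() is the right tool here, consider a minimal sketch with made-up data (the tickers, dates, and returns below are hypothetical). Only the symbol and date pairs that appear in both tables survive the join, so stocks that were not selected by the ranking drop out before the equal-weighted average is taken.

```r
library(dplyr)

# Hypothetical monthly returns for three made-up tickers
toy_returns <- tibble(
  symbol         = c("AAA", "BBB", "CCC"),
  date           = as.Date("2024-02-01"),
  monthly_return = c(0.05, -0.02, 0.01)
)

# Hypothetical ranking: only AAA and CCC were selected for this month
toy_ranks <- tibble(
  symbol = c("AAA", "CCC"),
  date   = as.Date("2024-02-01"),
  rank   = c(1, 2)
)

# Keep only the selected stocks, then average their returns (equal weights)
toy_portfolio <- inner_join(toy_returns, toy_ranks, by = c("symbol", "date")) %>%
  group_by(date) %>%
  summarize(portfolio_return = mean(monthly_return))

print(toy_portfolio)
```

In this toy case BBB is excluded, so the portfolio return is the average of 0.05 and 0.01.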

Step 4: Portfolio Performance


# Merge momentum ranking with next month’s returns
strategy_LO <- inner_join(returns, ranks, by = c("symbol", "date"))%>%
  arrange(date, rank)%>%
  group_by(date)%>%
  summarize(portfolio_return = mean(monthly_return, na.rm = TRUE))

# Compute cumulative returns
strategy_LO%>%
  mutate(cumulative_return = cumprod(1 + portfolio_return) - 1)%>%
# Plot results
ggplot(aes(x = date, y = cumulative_return)) +
    geom_line(color = "blue",size=2) +
    geom_hline(yintercept=0,linetype='dashed')+
    labs(title = "Long-biased Momentum Portfolio Performance",
         subtitle = 'Selecting the top 5 stocks in terms of past returns, lookback period of 60 trading days, and monthly rebalancing',
         y = "Cumulative Returns",
         x = '')+
    #Scales (percent comes from the scales package)
    scale_y_continuous(labels = percent,breaks=seq(-0.5,1.5,0.1))+
    scale_x_date(date_breaks = '3 months')+
    #Minimal ggplot2 theme
    theme_minimal()+
    #Adding further customizations
    theme(legend.position='none',
          axis.title = element_text(face='bold',size=15),
          axis.text = element_text(size=10),
          axis.text.x = element_text(angle=90),
          plot.title = element_text(size=20,face='bold'),
          plot.subtitle  = element_text(size=15))
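The cumulative return used in the chart compounds the monthly returns rather than adding them up. A short sketch with made-up monthly returns shows the difference:

```r
# Hypothetical monthly portfolio returns (made-up numbers)
monthly_return <- c(0.10, -0.05, 0.02)

# Compounded cumulative return after each month, as in the lecture code
cumulative_return <- cumprod(1 + monthly_return) - 1

# Naive sum of simple returns, shown only for comparison
simple_sum <- sum(monthly_return)

print(cumulative_return)
print(simple_sum)
```

After three months the compounded figure is 6.59%, whereas simply summing the returns would suggest 7%.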

Going long and short

Our previous momentum strategy was defined as a long-biased strategy: we only selected the top 5 stocks in terms of past returns and built a long position
A potential risk in this strategy is that, even if those stocks performed well in the past, the market as a whole may be in a downward trend and drag the long positions down with it
We can add a little bit more complexity to our momentum-based strategy by allowing long and short positions at the same time:
1. Calculate the rolling 3-month returns as before
2. Select the best and worst performing firms in each period
3. Build a long position on the top firms, and a short position in the worst firms
4. Recalculate metrics
In what follows, we will make small tweaks in our ranking definition so as to employ a long-short momentum strategy

(Re)define the Trading Signals


# Create the rankings
ranks <- prices%>%
  #Select relevant columns
  select(symbol, date, adjusted)%>%
  #Rename price for conciseness
  rename(price = adjusted)%>%
  #Group by symbol
  group_by(symbol)%>%
  #Reorder by symbol and date
  arrange(symbol, date)%>%
  #Get the price from the last 60th trading day (roughly a 3-month lag)
  mutate(lagged_price = lag(price, 60))%>%  
  mutate(momentum_return = log(price/lagged_price))%>% #Using continuously compounded returns
  drop_na()%>% #Remove any NAs
  group_by(year(date),month(date),symbol)%>%
  slice_head(n=1)%>%
  mutate(rank_date = floor_date(date, "month"))%>% # Set the date where we will apply the momentum strategy 
  group_by(rank_date)%>%
  mutate(rank = rank(desc(momentum_return)))%>%
  filter(rank %in% c(1:5,15:20))%>% #Keep ranks 1 to 5 (longs) and ranks 15 to 20 (shorts); note that 15:20 spans six ranks
  mutate(date = rank_date + months(1))%>% # Apply ranking next month
  ungroup()%>%
  select(symbol, date, rank)

# A tibble: 10 × 3
   symbol   date        rank
   <chr>    <date>     <dbl>
 1 SUZB3.SA 2018-04-01     1
 2 BBAS3.SA 2018-04-01     2
 3 IRBR3.SA 2018-04-01     3
 4 MGLU3.SA 2018-04-01     4
 5 ITUB3.SA 2018-04-01     5
 6 WEGE3.SA 2018-04-01    15
 7 RADL3.SA 2018-04-01    16
 8 BEEF3.SA 2018-04-01    17
 9 BRFS3.SA 2018-04-01    18
10 SUZB3.SA 2018-05-01     1
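Note that the filter above hard-codes ranks 15 to 20, while the output shows that only 18 stocks had a valid signal in April 2018, so the number of shorts varies by month. One possible alternative, sketched below on made-up data, is to define the short leg relative to the number of stocks available on each date. The column names n_stocks and side, the tickers, and the choice of k are illustrative assumptions, not part of the original code.

```r
library(dplyr)

# Hypothetical momentum signals for eight made-up tickers on one ranking date
toy_momentum <- tibble(
  rank_date       = as.Date("2024-01-01"),
  symbol          = paste0("S", 1:8),
  momentum_return = c(0.30, 0.10, -0.05, 0.20, -0.15, 0.02, 0.08, -0.25)
)

# Assumption: k names per leg (the lecture uses 5); requires 2 * k <= stocks per date
k <- 2

toy_ls_ranks <- toy_momentum %>%
  group_by(rank_date) %>%
  mutate(rank     = rank(desc(momentum_return)),
         n_stocks = n(),
         # Top k ranks go long, bottom k ranks go short, the rest are dropped
         side     = case_when(rank <= k            ~ "long",
                              rank >  n_stocks - k ~ "short",
                              TRUE                 ~ NA_character_)) %>%
  filter(!is.na(side)) %>%
  ungroup()

print(toy_ls_ranks)
```

With this definition both legs always hold k stocks, regardless of how many tickers have enough price history on a given date.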

Setting up a long and short momentum strategy


strategy_LS <- inner_join(returns, ranks, by = c("symbol", "date"))%>%
  arrange(date, rank)%>%
  mutate(monthly_return=ifelse(rank>=15,-1*monthly_return,monthly_return))%>%
  group_by(date)%>%
  summarize(portfolio_return = mean(monthly_return, na.rm = TRUE))

# Compute cumulative returns
strategy_LS%>%
  mutate(cumulative_return = cumprod(1 + portfolio_return) - 1)%>%
# Plot results
  ggplot(aes(x = date, y = cumulative_return)) +
    geom_line(color = "red",size=2) +
    geom_hline(yintercept=0,linetype='dashed')+
    labs(title = "Long-Short Momentum Portfolio Performance",
         subtitle = 'Selecting the top/worst 5 stocks in terms of past returns, lookback period of 60 trading days, and monthly rebalancing',
         y = "Cumulative Returns",
         x = '')+
    #Scales (percent comes from the scales package)
    scale_y_continuous(labels = percent,breaks=seq(-0.5,1.5,0.1))+
    scale_x_date(date_breaks = '3 months')+
    #Minimal ggplot2 theme
    theme_minimal()+
    #Adding further customizations
    theme(legend.position='none',
          axis.title = element_text(face='bold',size=15),
          axis.text = element_text(size=10),
          axis.text.x = element_text(angle=90),
          plot.title = element_text(size=20,face='bold'),
          plot.subtitle  = element_text(size=15))

Putting all together


chart_strategy_LO=strategy_LO%>%mutate(Type='Long-biased')
chart_strategy_LS=strategy_LS%>%mutate(Type='Long-Short')

chart_strategy_LO%>%
  rbind(chart_strategy_LS)%>%
  group_by(Type)%>%
  # Compute cumulative returns
  mutate(cumulative_return = cumprod(1 + portfolio_return) - 1)%>%
# Plot results
  ggplot(aes(x = date, y = cumulative_return,group=Type,col=Type)) +
    geom_line(size=2) +
    geom_hline(yintercept=0,linetype='dashed')+
    labs(title = "Momentum Strategies",
         subtitle = 'Both long-biased and long-short strategies.',
         y = "Cumulative Returns",
         x = '')+
    #Scales (percent comes from the scales package)
    scale_y_continuous(labels = percent,breaks=seq(-0.5,1.5,0.1))+
    scale_x_date(date_breaks = '3 months')+
    #Minimal ggplot2 theme
    theme_minimal()+
    #Adding further customizations
    theme(legend.position='bottom',
          axis.title = element_text(face='bold',size=15),
          axis.text = element_text(size=10),
          axis.text.x = element_text(angle=90),
          plot.title = element_text(size=20,face='bold'),
          plot.subtitle  = element_text(size=15))

Assessing portfolio performance

There is a lot you can do with the PerformanceAnalytics package - convert your newly created strategy_* objects into an xts object by calling portfolio%>%as.xts() (where portfolio stands for one of those objects) and experiment using these pre-built functions for assessing historical performance:
1. charts.RollingPerformance()
2. charts.PerformanceSummary()
3. chart.Histogram()
4. table.AnnualizedReturns()
Play around with changing the parameters of the strategy and document the effects of such changes!
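As a starting point, the sketch below builds an xts object from a small data frame shaped like the strategy_* objects and passes it to a few PerformanceAnalytics functions. The returns are made-up numbers, the column label "Momentum" is arbitrary, and the xts() constructor is used here as one possible way of doing the conversion.

```r
library(xts)
library(PerformanceAnalytics)

# Hypothetical monthly returns (made-up numbers), shaped like strategy_LO
toy_strategy <- data.frame(
  date = seq(as.Date("2023-01-01"), by = "month", length.out = 12),
  portfolio_return = c(0.02, -0.01, 0.03, 0.01, -0.02, 0.04,
                       0.00, 0.01, -0.03, 0.02, 0.01, 0.02)
)

# Build a one-column xts object indexed by date
toy_xts <- xts(toy_strategy$portfolio_return, order.by = toy_strategy$date)
colnames(toy_xts) <- "Momentum"

# Annualized return, annualized volatility, and Sharpe ratio (Rf = 0 by default)
print(table.AnnualizedReturns(toy_xts))

# Largest peak-to-trough loss over the sample
print(maxDrawdown(toy_xts))

# Compounded total return over the whole period
total_return <- Return.cumulative(toy_xts)
print(total_return)

# Uncomment to draw cumulative returns, period returns, and drawdowns in one figure
# charts.PerformanceSummary(toy_xts)
```

The same calls should work on the real strategy objects once they are converted, which makes it easy to compare the long-biased and long-short versions side by side.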

Choosing Efficient Portfolios

Now that we understand how to analyze the performance of single stocks, let’s turn our attention to how an investor can analyze a portfolio of assets!
Let’s start off with the simplest case: create a portfolio with two stocks, Amazon and Ferrari. Previously, we’ve shown that, for the analysis period, we had the following results in terms of risk and return:

Let’s create \(5\) different portfolios by shifting the allocation weights in each asset in steps of \(\pm20\%\)

Choosing an Efficient Portfolio, continued

We can plot this in a figure to show all possible risk \(\times\) return combinations
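The risk and return of each two-asset combination can be computed directly from the portfolio formulas. The sketch below uses hypothetical annualized inputs for two generic assets A and B (the lecture's actual figures for Amazon and Ferrari are not reproduced here) and moves the weight in 20% steps from 0% to 100%, which gives six combinations.

```r
# Hypothetical annualized inputs for two assets (made-up numbers)
mu    <- c(A = 0.12, B = 0.08)   # expected returns
sigma <- c(A = 0.30, B = 0.20)   # volatilities
rho   <- 0.25                    # correlation between A and B

# Allocation weights in steps of 20%
w_A <- seq(0, 1, by = 0.2)
w_B <- 1 - w_A

# Portfolio expected return: weighted average of the asset returns
port_return <- w_A * mu[["A"]] + w_B * mu[["B"]]

# Portfolio variance: own-variance terms plus the covariance term
port_var <- w_A^2 * sigma[["A"]]^2 + w_B^2 * sigma[["B"]]^2 +
  2 * w_A * w_B * rho * sigma[["A"]] * sigma[["B"]]

two_asset_grid <- data.frame(w_A, w_B, ret = port_return, vol = sqrt(port_var))
print(two_asset_grid)

# Uncomment to plot the risk x return combinations
# plot(two_asset_grid$vol, two_asset_grid$ret, xlab = "Volatility", ylab = "Return")
```

Using a finer step in seq() (for example 0.01) traces out the smooth curve shown in the slides with many portfolios.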

Choosing an Efficient Portfolio, 6 portfolios

Choosing an Efficient Portfolio, 11 portfolios

Choosing an Efficient Portfolio, >100 portfolios

Choosing an Efficient Portfolio

As a financial manager, one crucial job you have is to find portfolios that are not sub-optimal
1. For a given level of volatility, they deliver the highest possible return
2. Alternatively, for a given level of return, they deliver the lowest possible volatility
An easy way to look at this is to identify the minimum variance portfolio (MVP)
1. This portfolio is, among all combinations, the one with the lowest volatility
2. From there, if a given portfolio is riskier than the MVP, it needs to deliver higher returns!
3. On the other hand, if a portfolio is riskier than the MVP and delivers the same/lower returns, it can be considered inefficient

\(\rightarrow\) In other words: investors should look only for efficient portfolios and will choose among them based on their specific preferences for risk!
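For two assets, the weight of the minimum variance portfolio has a well-known closed form: \(w_A^{*} = (\sigma_B^2 - \sigma_{A,B}) / (\sigma_A^2 + \sigma_B^2 - 2\sigma_{A,B})\). The sketch below evaluates it for hypothetical inputs (made-up volatilities and correlation) and cross-checks the result against a brute-force search over a fine grid of weights.

```r
# Hypothetical inputs for two assets (made-up numbers)
sigma_A <- 0.30
sigma_B <- 0.20
rho     <- 0.25
cov_AB  <- rho * sigma_A * sigma_B

# Closed-form weight of asset A in the two-asset minimum variance portfolio
w_A_mvp <- (sigma_B^2 - cov_AB) / (sigma_A^2 + sigma_B^2 - 2 * cov_AB)

# Brute-force check: evaluate the variance on a fine grid and pick the minimum
w_grid   <- seq(0, 1, by = 0.01)
var_grid <- w_grid^2 * sigma_A^2 + (1 - w_grid)^2 * sigma_B^2 +
  2 * w_grid * (1 - w_grid) * cov_AB
w_A_grid <- w_grid[which.min(var_grid)]

mvp_vol <- sqrt(min(var_grid))

print(c(closed_form = w_A_mvp, grid_search = w_A_grid, mvp_vol = mvp_vol))
```

With these inputs the MVP holds 25% in asset A, and its volatility is lower than that of either asset held alone, which is the diversification effect at work.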

The Efficient Frontier

In our example, we used only two assets. What happens when we increase the number of potential assets?
Let’s replicate the same rationale by now investing our money in three possible stocks: Amazon, Ferrari, and VSCO

Which of these portfolios are efficient?

Choosing an Efficient Portfolio

What happens when you continuously increase the number of assets?
1. If you add stocks, you improve the frontier - i.e, you are able to create portfolios that span better options in terms of risk and return
2. If you continue adding assets, you will have what is called the Efficient Frontier
The Efficient Frontier is the set of portfolios where:
1. For a given level of volatility, you have the highest possible return among all portfolios with the same volatility level
2. For a given level of return, you have the lowest possible volatility among all portfolios with the same return level
Based on this, is there a single portfolio that all investors should hold? No! In practice, investors will choose among portfolios based on their specific preferences for risk and return
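One way to make the efficiency definition concrete is to enumerate many portfolios and flag those that are not dominated by any other. The sketch below does this for three generic assets with hypothetical expected returns, volatilities, and a constant correlation of 0.2 (all made-up inputs), using long-only weights on a 10% grid. A portfolio is flagged as efficient if no other portfolio on the grid offers at least the same return with no more volatility and is strictly better on one of the two.

```r
# Hypothetical inputs for three assets (made-up numbers)
mu   <- c(0.12, 0.08, 0.10)
vols <- c(0.30, 0.20, 0.25)
corr <- matrix(0.2, 3, 3)
diag(corr) <- 1
Sigma <- diag(vols) %*% corr %*% diag(vols)

# Long-only weights on a 10% grid that add up to 100%
grid <- expand.grid(w1 = seq(0, 1, 0.1), w2 = seq(0, 1, 0.1))
grid <- grid[grid$w1 + grid$w2 <= 1 + 1e-9, ]
grid$w3 <- 1 - grid$w1 - grid$w2

# Expected return and volatility of every portfolio on the grid
W <- as.matrix(grid[, c("w1", "w2", "w3")])
grid$ret <- as.numeric(W %*% mu)
grid$vol <- sqrt(rowSums((W %*% Sigma) * W))

# A portfolio is efficient if no other portfolio dominates it
grid$efficient <- sapply(seq_len(nrow(grid)), function(i) {
  dominated <- grid$vol <= grid$vol[i] & grid$ret >= grid$ret[i] &
    (grid$vol < grid$vol[i] | grid$ret > grid$ret[i])
  !any(dominated)
})

print(grid[grid$efficient, ])
```

On a coarse grid this is only an approximation of the frontier, but it shows the idea: the efficient portfolios form the upper-left edge of the cloud of risk and return combinations.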

The Markowitz Mean-Variance Problem

We generalize the problem of finding the efficient portfolio for a given number of assets \(N\) (here, \(N=5\)) as:

\[ \min_{\{w_1,w_2,w_3,w_4,w_5\}} \sigma^2_p = w^T \Sigma w, \text{ such that:} \\ \begin{cases} \sum_{i=1}^5 w_i=1 \text{ (1)}\\ 0\leq w_i \leq 1, \forall i \text{ (2)} \end{cases} \]

This is also known as the Markowitz Mean-Variance optimization problem for a long-only portfolio

The Markowitz Mean-Variance Problem, continued

In other words, you are solving the following problem: find the set of allocation weights \(w_1,w_2,w_3,w_4\), and \(w_5\) (the % that you allocate in each of the five stocks) that, when used to create a portfolio, are the ones that create the portfolio with the minimum variance among all possible combinations
In order to do that, you have to ensure that the weights add up to 100% (so you’re fully investing your capital), which is the first condition. The second condition states that a stock cannot have a negative weight nor a weight that is greater than 100%

Important: Note that \(w^T\Sigma w\) is nothing more than the matrix form of \(\sum_{i=1}^{N}w_i^2\sigma_i^2+ 2\sum_{i=1}^{N}\sum_{j>i}w_i w_j\sigma_{i,j}\), which is the variance of a portfolio that consists of \(N\) assets
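A minimal sketch of this optimization is shown below. It uses the quadprog package as a stand-in solver (an assumption made for illustration; it is not necessarily the tool used in the lecture) and a hypothetical covariance matrix built from made-up volatilities and a constant correlation of 0.3. It also checks numerically that the matrix form of the portfolio variance matches the expanded sum of variance and covariance terms.

```r
library(quadprog)

# Hypothetical covariance matrix for five assets (made-up inputs)
vols <- c(0.20, 0.25, 0.30, 0.35, 0.40)
N    <- length(vols)
corr <- matrix(0.3, N, N)
diag(corr) <- 1
Sigma <- diag(vols) %*% corr %*% diag(vols)

# solve.QP minimizes (1/2) w' D w - d' w subject to t(Amat) %*% w >= bvec,
# treating the first `meq` constraints as equalities
sol <- solve.QP(
  Dmat = 2 * Sigma,                      # so the objective equals w' Sigma w
  dvec = rep(0, N),                      # no linear term
  Amat = cbind(rep(1, N), diag(N)),      # column 1: sum(w) = 1; others: w_i >= 0
  bvec = c(1, rep(0, N)),
  meq  = 1
)
w <- sol$solution

# Portfolio variance in matrix form
port_var_matrix <- as.numeric(t(w) %*% Sigma %*% w)

# Same variance from the expanded sum: own variances plus twice each covariance pair
port_var_sum <- sum(w^2 * diag(Sigma))
for (i in 1:(N - 1)) {
  for (j in (i + 1):N) {
    port_var_sum <- port_var_sum + 2 * w[i] * w[j] * Sigma[i, j]
  }
}

print(round(w, 4))
print(c(matrix_form = port_var_matrix, expanded_sum = port_var_sum))
```

The upper bound \(w_i \leq 1\) does not need its own constraint here, since it is implied by the weights being non-negative and adding up to one.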

References

Jegadeesh, Narasimhan, and Sheridan Titman. 1993. “Returns to Buying Winners and Selling Losers: Implications for Stock Market Efficiency.” The Journal of Finance 48 (1): 65–91. https://doi.org/10.1111/j.1540-6261.1993.tb04702.x.

Scheuch, Christoph, Stefan Voigt, and Patrick Weiss. 2023. Tidy Finance with R. Chapman & Hall/CRC. https://www.tidy-finance.org/r/.

Wickham, Hadley, Mine Cetinkaya-Rundel, and Garrett Grolemund. 2023. R for Data Science. O’Reilly Media. https://r4ds.had.co.nz/.